Longitudinal omics data (LOD) analysis is essential for understanding the dynamics of biological processes and disease progression over time. This review explores statistical and computational approaches for analyzing such data, emphasizing their applications and limitations. The main characteristics of longitudinal data, such as imbalance, high dimensionality, and non-Gaussianity, are discussed in the context of modeling and hypothesis testing. We discuss the properties of linear mixed models (LMM) and generalized linear mixed models (GLMM) as the foundation of LOD analyses and highlight their extensions for handling these obstacles in both frequentist and Bayesian frameworks. Within dynamic data analysis, we differentiate between time-course and longitudinal analyses, covering functional data analysis (FDA) and replication constraints. We explore classification techniques, single-cell studies as an exemplary class of longitudinal omics applications, survival modeling, and multivariate methods for clinical and biomarker-based applications. Emerging topics, including data integration, clustering, and network-based modeling, are also discussed. We categorize the state-of-the-art approaches applicable to omics data, highlighting how they address these data features. This review serves as a guideline for researchers seeking robust strategies to effectively analyze longitudinal omics data, which is typically complex.
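As one concrete illustration of the LMM workflow that such analyses build on, the sketch below fits a random-intercept linear mixed model to simulated long-format expression data using statsmodels. This is a minimal sketch, not the review's own example: the column names ("expression", "time", "subject"), the simulation settings, and the random-effects structure are illustrative assumptions.

```python
# Minimal sketch: random-intercept LMM on simulated longitudinal expression data.
# All names and settings below are illustrative assumptions.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_subjects, n_times = 20, 5
subjects = np.repeat(np.arange(n_subjects), n_times)
time = np.tile(np.arange(n_times), n_subjects)

# Simulate one gene's expression: fixed time trend + subject-specific
# intercept (induces within-subject correlation) + measurement noise.
subject_effect = rng.normal(0.0, 1.0, n_subjects)[subjects]
expression = 2.0 + 0.5 * time + subject_effect + rng.normal(0.0, 0.5, subjects.size)
df = pd.DataFrame({"expression": expression, "time": time, "subject": subjects})

# Random intercept per subject accounts for the repeated-measures structure.
model = smf.mixedlm("expression ~ time", df, groups=df["subject"])
result = model.fit()
print(result.summary())  # fixed-effect time slope plus variance components
```

In practice one such model would be fit per feature (gene, protein, metabolite), with GLMMs replacing the Gaussian likelihood for count-valued or otherwise non-Gaussian omics measurements.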
Many real-world time series datasets exhibit structural changes over time. A popular model for capturing their temporal dependence is the vector autoregression (VAR), which can accommodate structural changes through time-evolving transition matrices. The problem then becomes to estimate both the (unknown) number of structural break points and the VAR model parameters. An additional challenge emerges in the presence of very large datasets, namely how to accomplish these two objectives in a computationally efficient manner. In this article, we propose a novel procedure that leverages a block segmentation scheme (BSS), which reduces the number of model parameters to be estimated through a regularized least-squares criterion. Specifically, BSS examines appropriately defined blocks of the available data, which, when combined with a fused-lasso-based estimation criterion, leads to significant computational gains without compromising the statistical accuracy in identifying the number and location of the structural breaks. This procedure is further coupled with new local and exhaustive search steps to consistently estimate the number and relative location of the break points. The procedure is scalable to big high-dimensional time series datasets, with a computational complexity that can achieve O(n), where n is the length of the time series (sample size), compared to an exhaustive search procedure, which requires substantially more steps. Extensive numerical work on synthetic data supports the theoretical findings and illustrates the attractive properties of the procedure. Finally, an application to a neuroscience dataset demonstrates its usefulness in practice. Supplementary files for this article are available online.
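To make the block segmentation idea concrete, here is a minimal numpy sketch under simplifying assumptions: it fits a VAR(1) transition matrix by ordinary least squares on fixed-size blocks and flags the largest jump between consecutive block estimates as a candidate break. The block length, the simulated break location, and the argmax rule are illustrative; the actual BSS procedure instead combines a fused-lasso criterion with local and exhaustive search refinement steps.

```python
# Illustrative sketch of block-wise break screening for a VAR(1); this is
# a simplified surrogate for BSS, not the authors' implementation.
import numpy as np

rng = np.random.default_rng(1)
p, n, block = 3, 400, 50           # dimension, sample size, block length
A1 = 0.5 * np.eye(p)               # transition matrix before the break
A2 = -0.5 * np.eye(p)              # transition matrix after the break (t = 200)
X = np.zeros((n, p))
for t in range(1, n):
    A = A1 if t < 200 else A2
    X[t] = X[t - 1] @ A.T + rng.normal(0.0, 0.1, p)

def fit_var1(data):
    """OLS estimate of the VAR(1) transition matrix on one block."""
    Y, Z = data[1:], data[:-1]
    return np.linalg.lstsq(Z, Y, rcond=None)[0].T

estimates = [fit_var1(X[s:s + block]) for s in range(0, n - block + 1, block)]
# Frobenius-norm jumps between consecutive block estimates; a large jump
# indicates a structural break near that block boundary.
jumps = [np.linalg.norm(estimates[i + 1] - estimates[i])
         for i in range(len(estimates) - 1)]
candidate = (int(np.argmax(jumps)) + 1) * block
print(f"candidate break near t = {candidate}")  # expected near t = 200
```

Examining blocks rather than every time point is what yields the computational savings: only O(n / block) model fits are screened, after which a local search around the flagged boundary can pinpoint the break.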